Abstract. In this paper we examine the use of topological methods
for multivariate statistics. Using persistent homology from
computational algebraic topology, a random sample is used to
construct estimators of persistent homology. This estimation
procedure can then be evaluated using the bottleneck distance between
the estimated persistent homology and the true persistent homology.
The connection to statistics comes from the fact that when
viewed as a nonparametric regression problem, the bottleneck
distance is bounded by the sup-norm loss. Consequently, a sharp
asymptotic minimax bound is determined under the sup-norm risk
over Hölder classes of functions for the nonparametric regression
problem on manifolds. This provides good convergence properties
for the persistent homology estimator in terms of the expected
bottleneck distance.



1. Introduction 

Quantitative scientists of diverse backgrounds are being asked to ap- 
ply the techniques of their specialty to data which is greater in both size 
and complexity than that which has been studied previously. Massive, 
multivariate data sets, for which traditional linear methods are inad- 
equate, pose challenges in representation, visualization, interpretation 
and analysis. A common finding is that these massive multivariate data 
sets require the development of new statistical methodology and that 
these advances are dependent on increasing technical sophistication. 
Two such data-analytic techniques that have recently come to the fore 
are computational algebraic topology and geometric statistics. 

Commonly, one starts with data obtained from some induced geometric
structure, such as a curved submanifold of a numerical space,
or a singular algebraic variety. The observed data is obtained as a
random sample from this space, and the objective is to statistically
recover features of the underlying space.

In computational algebraic topology, one attempts to recover qual- 
itative global features of the underlying data, such as connectedness, 
or the number of holes, or the existence of obstructions to certain con- 
structions, based upon the random sample. In other words, one hopes 
to recover the underlying topology. An advantage of topology is that 
it is stable under deformations and thus can potentially lead to robust 
statistical procedures. A combinatorial construction such as the alpha
complex or the Čech complex, see for example [33], converts the data
into an object for which it is possible to compute the topology. However,
it is quickly apparent that such a construction and its calculated
topology depend on the scale at which one considers the data. A multi- 
scale solution to this problem is the technique of persistent homology. 
It quantifies the persistence of topological features as the scale changes. 
Persistent homology is useful for visualization, feature detection and 
object recognition. Applications of persistent topology include protein
structure analysis [30], gene expression [11], and sensor networks [8]. In
a recent application to brain image data, persistent topology is shown
to discriminate between two populations [5].

In geometric statistics one uses the underlying Riemannian structure 
to recover quantitative information concerning the underlying probabil- 
ity distribution and functionals thereof. The idea is to extend statistical 
estimation techniques to functions over Riemannian manifolds, utiliz- 
ing the Riemannian structure. One then considers the magnitude of 
the statistical accuracy of these estimators. Considerable progress has 
been achieved in terms of optimal estimation [TJ1 [TJl [X6J [261 HI EH] • 
Other related works include [281 [29], [231 U E]- There is also a growing 
interest in function estimation over manifolds in the learning theory 
literature [II EH [2]; see also the references cited therein. 

Although computational algebraic topology and geometric statistics 
appear dissimilar and seem to have different objectives, it has recently 
been noticed that they share a commonality through statistical sam- 
pling. In particular, a pathway between them can be established by 
using elements of Morse theory. This is achieved through the fact that 
persistent homology can be applied to Morse functions and comparisons 
between two Morse functions can be assessed by a metric called the 
bottleneck distance. Furthermore, the bottleneck distance is bounded 
by the sup-norm distance between the two Morse functions on some 
underlying manifold. This framework thus provides just enough structure
for a statistical interpretation. Indeed, consider a nonparametric
regression problem on some manifold. Given data in this framework 
one can construct a nonparametric regression function estimator such 
that the persistent homology associated with this estimated regression 
function is an estimator for the persistent homology of the true re- 
gression function, as assessed by the bottleneck distance. Since this 
will be bounded by the sup-norm loss, by providing a sharp sup-norm 
minimax estimator of the regression function, we can effectively bound 
the expected bottleneck distance between the estimated persistent ho- 
mology and the true persistent homology. Consequently, by showing 
consistency in the sup-norm risk, we can effectively show consistency 
in the bottleneck risk for persistent homology which is what we will 
demonstrate. Let us again emphasize that the pathway that allows us 
to connect computational algebraic topology with geometric statistics 
is Morse theory. This is very intriguing in that a pathway between the 
traditional subjects of geometry and topology is also Morse theory. 

We now summarize this paper. In Section 2 we will lay down the
topological preliminaries needed to state our main results. In Section
3 we go over the preliminaries needed for nonparametric regression on
a Riemannian manifold. Section 4 states the main results, where sharp
sup-norm minimax bounds consisting of constant and rate, and sharp
sup-norm estimators are presented. The connection to bounding the
persistent homology estimators thus ensues. Following this, in Section
5 a brief discussion of the implementation is given. Proofs of the main
results are collected in Section 6. An Appendix that contains some
technical material is included for completeness.

2. Topological Preliminaries 

Let us assume that $M$ is a $d$-dimensional compact Riemannian manifold
and suppose $f : M \to \mathbb{R}$ is some smooth function. Consider the
sublevel set, or lower excursion set,

(2.1) $M_{f \le r} := \{x \in M \mid f(x) \le r\} = f^{-1}((-\infty, r])$.

It is of interest to note that for certain classes of smooth functions, 
the topology of M can be approached by studying the geometry of the 
function. 

To be more precise, for some smooth $f : M \to \mathbb{R}$, consider a point
$p \in M$ where in local coordinates the derivatives $\partial f/\partial x_j$ vanish.
Then that point is called a critical point, and the evaluation $f(p)$ is
called a critical value. A critical point $p \in M$ is called non-degenerate if
the Hessian $(\partial^2 f/\partial x_i \partial x_j)$ is nonsingular. Functions all of whose
critical points are non-degenerate are called Morse functions. Later we
will see that differentiability is not needed when approached
homologically.

The geometry of Morse functions can completely characterize the
homotopy type of $M$ by the way in which topological characteristics
of sublevel sets (2.1) change at critical points. Indeed, classical Morse
theory tells us that the homotopy type of (2.1) is characterized by
attaching a cell whose dimension is determined by the number of negative
eigenvalues of the Hessian at a critical point to the boundary of the
set (2.1) at the critical point. This indeed is a pathway that connects
geometry with topology, and one which we shall also use to bridge
statistics. Some background material in topology and Morse theory is
provided in Appendices A and B.

As motivation let us consider a real-valued function $f$ that is a mixture
of two bump functions on the disk of radius 10 in $\mathbb{R}^2$, see Figure 2.1.



Figure 2.1. A mixture of two bump functions and various contours,
below which are the sublevel sets.

In this example, the maximum of $f$ equals 2, so $M_{f \le 2} = M$. This
sublevel set is the disk and therefore has no interesting topology since
the disk is contractible. In contrast, consider the sublevel sets when
$r = 1$, $1.2$, and $1.5$ (see Figures 2.2, 2.3 and 2.4).

In these cases, the sublevel sets $M_{f \le r}$ have non-trivial topology,
namely one, two and one hole(s) respectively, each of whose boundaries
is one-dimensional. This topology is detected algebraically by the
first integral homology group $H_1(M_{f \le r})$, which will be referred to as
the homology of degree 1 at level $r$. This group enumerates the
topologically distinct cycles in the sublevel set. In the first and third
cases, for each integer $z \in \mathbb{Z}$, there is a cycle which wraps around the hole
$z$ times. We have $H_1(M_{f \le r}) = \mathbb{Z}$. In the second case, we have two
generating non-trivial cycles and so $H_1(M_{f \le r}) = \mathbb{Z} \oplus \mathbb{Z}$. For a review
of homology the reader can consult Appendix A.

Figure 2.2. The sublevel set at r = 1 has one hole.

Figure 2.3. The sublevel set at r = 1.2 has two holes.

Figure 2.4. The sublevel set at r = 1.5 has one hole.

2.1. Persistent topology. A computational procedure for determin- 
ing how the homology persists as the level r changes is provided in 
[TP] 155] . In the above example there are two persistent homology classes 
(defined below). One class is born when $r = 1.1$, the first sublevel set
that has two holes, and dies at $r = 1.4$, the first sublevel set for which
the second hole disappears. The other class is born at $r = 0$ and persists
until $r = 2$. Thus the persistent homology can be completely described
by the two ordered pairs $\{(1.1, 1.4), (0, 2)\}$. This is called the reduced
persistence diagram (defined below) of $f$, denoted $\bar{\mathcal{D}}(f)$. For a
persistent homology class described by $(a, b)$, call $b - a$ its lifespan.
From the point of view of an experimentalist, a long-lived persistent
homology class is evidence of a significant feature in the data, while a
short-lived one is likely to be an artifact.

We now give some precise definitions. 

Definition 2.1. Let $k$ be a nonnegative integer. Given $f : M \to \mathbb{R}$
and $a \le b \in \mathbb{R}$, the inclusion of sublevel sets $i_a^b : M_{f \le a} \hookrightarrow M_{f \le b}$ induces
a map on homology

$H_k(i_a^b) : H_k(M_{f \le a}) \to H_k(M_{f \le b})$.

The image of $H_k(i_a^b)$ is the persistent homology group from $a$ to $b$. Let
$\beta_a^b$ be its dimension. This counts the independent homology classes
which are born by time $a$ and die after time $b$.

Call a real number $a$ a homological critical value of $f$ if for all
sufficiently small $\epsilon > 0$ the map $H_k(i_{a-\epsilon}^{a+\epsilon})$ is not an isomorphism. Call $f$
tame if it has finitely many homological critical values, and for each
$a \in \mathbb{R}$, $H_k(M_{f \le a})$ is finite dimensional. In particular, any Morse
function on a compact manifold is tame.

Assume that $f$ is tame. Choose $\epsilon$ smaller than the distance between
any two homological critical values. For each pair of homological
critical values $a < b$, we define their multiplicity $\mu_a^b$, which we interpret as
the number of independent homology classes that are born at $a$ and die
at $b$. We count the homology classes born by time $a + \epsilon$ that die after
time $b - \epsilon$. Among these subtract those born by $a - \epsilon$ and subtract
those that die after $b + \epsilon$. This double counts those born by $a - \epsilon$ that
die after $b + \epsilon$, so we add them back. That is,

$\mu_a^b = \beta_{a+\epsilon}^{b-\epsilon} - \beta_{a-\epsilon}^{b-\epsilon} - \beta_{a+\epsilon}^{b+\epsilon} + \beta_{a-\epsilon}^{b+\epsilon}$.
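
The inclusion-exclusion count above can be illustrated with a short Python sketch. It assumes the persistent homology is already available as a list of (birth, death) intervals, which is a hypothetical input format chosen only for illustration; the names `beta` and `multiplicity` are likewise illustrative. Here `beta(intervals, a, b)` counts the classes born by time a that die after time b, and `multiplicity` combines four such counts as in the formula for the multiplicity of the pair (a, b).

```python
from typing import List, Tuple

Interval = Tuple[float, float]  # (birth, death) of one homology class


def beta(intervals: List[Interval], a: float, b: float) -> int:
    """Number of classes born by time a that die after time b."""
    return sum(1 for birth, death in intervals if birth <= a and death > b)


def multiplicity(intervals: List[Interval], a: float, b: float, eps: float) -> int:
    """Inclusion-exclusion count of classes born at a and dying at b.

    eps is assumed smaller than the gap between any two critical values.
    """
    return (
        beta(intervals, a + eps, b - eps)
        - beta(intervals, a - eps, b - eps)
        - beta(intervals, a + eps, b + eps)
        + beta(intervals, a - eps, b + eps)
    )


# Hypothetical example: the two classes of the text, with (1.1, 1.4) doubled.
example = [(1.1, 1.4), (0.0, 2.0), (1.1, 1.4)]
```

With `eps = 0.05`, `multiplicity(example, 1.1, 1.4, 0.05)` counts the doubled class twice, while a pair such as (0, 1.4) that does not occur gets multiplicity zero.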

The persistent homology of $f$ may be encoded as follows. The reduced
persistence diagram of $f$, $\bar{\mathcal{D}}(f)$, is the multiset of pairs $(a, b)$
together with their multiplicities $\mu_a^b$. We call this a diagram since it
is convenient to plot these points on the plane. We will see that it
is useful to add homology classes which are born and die at the same
time. Let the persistence diagram of $f$, $\mathcal{D}(f)$, be given by the union of
$\bar{\mathcal{D}}(f)$ and $\{(a, a)\}_{a \in \mathbb{R}}$, where each $(a, a)$ has infinite multiplicity.

2.2. Bottleneck distance. Cohen-Steiner, Edelsbrunner and Harer
[6] introduced the following metric on the space of persistence
diagrams. This metric is called the bottleneck distance and it bounds the
Hausdorff distance. It is given by

(2.2) $d_B(\mathcal{D}(f), \mathcal{D}(g)) = \inf_{\gamma} \sup_{p \in \mathcal{D}(f)} \| p - \gamma(p) \|_\infty$,



STATISTICAL TOPOLOGY 



where the infimum is taken over all bijections $\gamma : \mathcal{D}(f) \to \mathcal{D}(g)$ and
$\| \cdot \|_\infty$ denotes supremum-norm over sets.

For example, let $f$ be the function considered at the start of this
section. Let $g$ be a unimodal, radially-symmetric function on the same
domain with maximum 2.2 at the origin and minimum 0. We showed
that $\bar{\mathcal{D}}(f) = \{(1.1, 1.4), (0, 2)\}$. Similarly, $\bar{\mathcal{D}}(g) = \{(0, 2.2)\}$. The
bottleneck distance is achieved by the bijection $\gamma$ which maps $(0, 2)$ to
$(0, 2.2)$ and $(1.1, 1.4)$ to $(1.25, 1.25)$ and is the identity on all 'diagonal'
points $(a, a)$. Since the diagonal points have infinite multiplicity this
is a bijection. Thus, $d_B(\mathcal{D}(f), \mathcal{D}(g)) = 0.2$.
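
The value 0.2 in this example can be checked with a brute-force Python sketch. It assumes each reduced diagram is a short list of (birth, death) pairs and uses `None` as a stand-in for a diagonal point, so that matching a point (a, b) to the diagonal costs (b - a)/2 in the sup-norm. The exhaustive search over matchings is only feasible for very small diagrams and is not meant to represent how the distance is computed in practice; the function names are illustrative.

```python
from itertools import permutations
from typing import List, Optional, Tuple

Point = Tuple[float, float]


def cost(p: Optional[Point], q: Optional[Point]) -> float:
    """Sup-norm cost of matching p to q; None stands for the diagonal."""
    if p is None and q is None:
        return 0.0
    if p is None:
        return (q[1] - q[0]) / 2.0  # q is sent to its nearest diagonal point
    if q is None:
        return (p[1] - p[0]) / 2.0  # p is sent to its nearest diagonal point
    return max(abs(p[0] - q[0]), abs(p[1] - q[1]))


def bottleneck(P: List[Point], Q: List[Point]) -> float:
    """Brute-force bottleneck distance between two small reduced diagrams."""
    # Pad each side with enough diagonal copies to allow any partial matching.
    left = list(P) + [None] * len(Q)
    right = list(Q) + [None] * len(P)
    return min(
        max((cost(p, q) for p, q in zip(left, perm)), default=0.0)
        for perm in permutations(right)
    )


d = bottleneck([(1.1, 1.4), (0.0, 2.0)], [(0.0, 2.2)])
```

Up to floating-point rounding, `d` agrees with the value obtained from the bijection described in the text.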

In [5], the following result is proven:

(2.3) $d_B(\mathcal{D}(f), \mathcal{D}(g)) \le \| f - g \|_\infty$,

where $f, g : M \to \mathbb{R}$ are tame functions and $\| \cdot \|_\infty$ denotes sup-norm
over functions.



2.3. Connection to Statistics. It is apparent that most articles on 
persistent topology do not as of yet incorporate statistical foundations 
although they do observe them heuristically. The approach in [25] com- 
bines topology and statistics and calculates how much data is needed to 
guarantee recovery of the underlying topology of the manifold. A draw- 
back of that technique is that it supposes that the size of the smallest 
features of the data is known a priori. To date the most comprehensive 
parametric statistical approach is contained in [4]. In that paper, the
unknown probability distribution is assumed to belong to a parametric 
family of distributions. The data is then used to estimate the level so 
as to recover the persistent topology of the underlying distribution. 

As far as we are aware, no statistical foundation for the nonparametric
case has been formulated, although [6] provides the topological
machinery for making a concrete statistical connection. In particular, 
persistent homology of a function is encoded in its reduced persistence 
diagram. A metric on the space of persistence diagrams between two 
functions is available which bounds the Hausdorff distance and this in 
turn is bounded by the sup-norm distance between the two functions. 
Thus by viewing one function as the parameter, while the other is 
viewed as its estimator, the asymptotic sup-norm risk bounds the ex- 
pected Hausdorff distance thus making a formal nonparametric statis- 
tical connection. This in turn lays down a framework for topologically 
classifying clusters in high dimensions. 

3. Nonparametric Regression on Manifolds

Consider the following nonparametric regression problem

(3.1) $y = f(x) + \varepsilon, \quad x \in M$,

where $M$ is a $d$-dimensional compact Riemannian manifold, $f : M \to \mathbb{R}$
is the regression function and $\varepsilon$ is a normal random variable with
mean zero and variance $\sigma^2 > 0$.

For a given sample $(y_1, x_1), \ldots, (y_n, x_n)$, let $\hat f$ be an estimator of $f$
based on the regression model (3.1). We will assess the estimator's
performance by the sup-norm loss:

(3.2) $\| \hat f - f \|_\infty = \sup_{x \in M} | \hat f(x) - f(x) |$.

Furthermore, we will take as the parameter space, $\Lambda(\beta, L)$, the class of
Hölder functions

(3.3) $\Lambda(\beta, L) = \{ f : M \to \mathbb{R} \mid |f(x) - f(z)| \le L \rho(x, z)^\beta,\ x, z \in M \}$,

where $0 < \beta \le 1$ and $\rho$ is the Riemannian metric on $M$, i.e., $\rho(x, z)$ is
the geodesic length (determined by the metric tensor) between $x, z \in M$.

For $w(u)$, a continuous non-decreasing function which increases no
faster than a power of its argument as $u \to \infty$ with $w(0) = 0$, we define
the sup-norm minimax risk by

(3.4) $r_n(w, \beta, L) = \inf_{\hat f} \sup_{f \in \Lambda(\beta, L)} E\, w(\psi_n^{-1} \| \hat f - f \|_\infty)$,

where $\psi_n \to 0$ is the sup-norm minimax rate, as $n \to \infty$, and $E$
denotes expectation with respect to (3.1), where $\varepsilon$ is normally distributed.

3.1. Asymptotic equidistance on manifolds. Consider a set of
points $z_i \in M$, $i = 1, \ldots, m$. We will say that the set of points is
asymptotically equidistant if

(3.5) $\inf_{i \ne j} \rho(z_i, z_j) \sim \frac{(\operatorname{vol} M)^{1/d}}{m}$

as $m \to \infty$ for all $i, j = 1, \ldots, m$, where for two real sequences $\{a_m\}$
and $\{b_m\}$, $a_m \sim b_m$ will mean $|a_m / b_m| \to 1$ as $m \to \infty$. This implies
that

(3.6) $\frac{\max_j \min_{i \ne j} \rho(z_i, z_j)}{\min_j \min_{i \ne j} \rho(z_i, z_j)} \to 1$

as $m \to \infty$. It will be assumed throughout that the manifold admits a
collection of asymptotically equidistant points. This is certainly true
for the sphere (in any dimension), and will be true for all compact
Riemannian manifolds since the injectivity radius is strictly positive.
We note that [27] makes use of this condition as well.
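We will need the following constants

(3.7) $C_0 = L^{d/(2\beta+d)} \left( \frac{\sigma^2 \operatorname{vol} M \, (\beta + d) d^2}{\operatorname{vol} S^{d-1} \, \beta^2} \right)^{\beta/(2\beta+d)}$,

(3.8) $\psi_n = \left( \frac{\log n}{n} \right)^{\beta/(2\beta+d)}$,

and 'vol' denotes the volume of the object in question, where $S^{d-1}$ is
the $(d-1)$-dimensional unit sphere with $\operatorname{vol} S^{d-1} = 2\pi^{d/2}/\Gamma(d/2)$ and
$\Gamma$ is the gamma function.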
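
As a rough numerical aid, the following Python sketch evaluates the volume formula $\operatorname{vol} S^{d-1} = 2\pi^{d/2}/\Gamma(d/2)$, the rate $\psi_n = (\log n / n)^{\beta/(2\beta+d)}$, and a constant of the form $C_0 = L^{d/(2\beta+d)} (\sigma^2 \operatorname{vol} M (\beta+d) d^2 / (\operatorname{vol} S^{d-1} \beta^2))^{\beta/(2\beta+d)}$. The function and parameter names are assumptions made for illustration, and the expression used for `c0` is only as reliable as that reading of (3.7).

```python
import math


def vol_unit_sphere(d: int) -> float:
    """Volume of the unit sphere S^{d-1}: 2 * pi^(d/2) / Gamma(d/2)."""
    return 2.0 * math.pi ** (d / 2.0) / math.gamma(d / 2.0)


def psi_n(n: int, beta: float, d: int) -> float:
    """Sup-norm rate (log n / n)^(beta / (2 beta + d))."""
    return (math.log(n) / n) ** (beta / (2.0 * beta + d))


def c0(L: float, beta: float, d: int, sigma2: float, vol_M: float) -> float:
    """Constant C_0, under the reading of (3.7) stated in the lead-in."""
    inner = sigma2 * vol_M * (beta + d) * d ** 2 / (vol_unit_sphere(d) * beta ** 2)
    return L ** (d / (2.0 * beta + d)) * inner ** (beta / (2.0 * beta + d))


# Hypothetical usage: Lipschitz functions (beta = 1) on a 2-dimensional manifold.
rates = [psi_n(n, beta=1.0, d=2) for n in (10 ** 3, 10 ** 4, 10 ** 5)]
```

The list `rates` decreases slowly with n, which reflects the exponent $\beta/(2\beta+d)$ shrinking as the dimension d grows.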

Define the geodesic ball of radius $r > 0$ centered at $z \in M$ by

(3.9) $B_z(r) = \{ x \in M \mid \rho(x, z) < r \}$.

We have the following result, whose proof will be detailed in Section 6.1.

Lemma 3.1. Let $z_i \in M$, $i = 1, \ldots, m$, be asymptotically equidistant.
Let $\lambda = \lambda(m)$ be the largest number such that $\bigcup_{i=1}^m \bar B_{z_i}(\lambda^{-1}) = M$,
where $\bar B_{z_i}(\lambda^{-1})$ is the closure of the geodesic ball of radius $\lambda^{-1}$ around
$z_i$. Then there is a $C_1 > 0$ such that $\limsup_{m \to \infty} m \lambda(m)^{-d} \le C_1$.

3.2. An estimator. Fix a $\delta > 0$ and let

$m = \left[ C_1 \left( \frac{L(2\beta + d)}{\delta C_0 d \psi_n} \right)^{d/\beta} \right]$,

where $C_1$ is a sufficiently large constant from Lemma 3.1, hence $m \le n$
and $m \to \infty$ when $n \to \infty$, and for $s \in \mathbb{R}$, $[s]$ denotes the greatest
integer part.

For the design points $\{x_i : i = 1, \ldots, n\}$ on $M$, assume that
$\{x_{i_j} \in M, j = 1, \ldots, m\}$ is an asymptotically equidistant subset on $M$.
Let $A_j$, $j = 1, \ldots, m$, be a partition of $M$ such that $A_j$ is the set of
those $x \in M$ for which $x_{i_j}$ is the closest point in the subset
$\{x_{i_1}, \ldots, x_{i_m}\}$. Thus, for $j = 1, \ldots, m$,

(3.10) $A_j = \{ x \in M \mid \rho(x_{i_j}, x) = \min_{k = 1, \ldots, m} \rho(x_{i_k}, x) \}$.

Let $A_j$, $j = 1, \ldots, m$, be as in (3.10), define $1_{A_j}(x)$ to be the
indicator function on the set $A_j$, and consider the estimator

(3.11) $\hat f(x) = \sum_{j=1}^m \hat a_j 1_{A_j}(x)$,



10 P.BUBENIK, G.CARLSON, P.T. KIM, AND Z-M. LUO 

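where for $L > 0$, $0 < \beta \le 1$,

$\hat a_j = \frac{\sum_{i=1}^n K_{\kappa, x_{i_j}}(x_i) y_i}{\sum_{i=1}^n K_{\kappa, x_{i_j}}(x_i)}$,

$\kappa = \left( \frac{C_0 \psi_n}{L} \right)^{-1/\beta}$,

$K_{\kappa, x_{i_j}}(u) = (1 - (\kappa \rho(x_{i_j}, u))^\beta)_+$,

and $s_+ = \max(s, 0)$, $s \in \mathbb{R}$. We remark that when $m$ is sufficiently
large, hence $\kappa$ is also large, the support set of $K_{\kappa, x_{i_j}}(u)$ is the closed
geodesic ball $\bar B_{x_{i_j}}(\kappa^{-1})$ around $x_{i_j}$ for $j = 1, \ldots, m$.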
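
To make the construction concrete, here is a minimal Python sketch of a piecewise-constant, kernel-weighted estimator of this type on the unit circle. Several things are assumptions made for illustration: the manifold is taken to be $S^1$ with points given as angles, the geodesic distance is arc length, the bandwidth parameter `kappa` is supplied directly by the caller rather than derived from $C_0$ and $\psi_n$, and the names `geodesic`, `kernel` and `fit` are hypothetical. Each coefficient is a weighted average of the responses with weights $(1 - (\kappa \rho)^\beta)_+$, and the fitted function returns the coefficient of the nearest center.

```python
import math
from typing import Callable, List, Sequence


def geodesic(x: float, z: float) -> float:
    """Arc-length distance between two angles on the unit circle."""
    diff = abs(x - z) % (2.0 * math.pi)
    return min(diff, 2.0 * math.pi - diff)


def kernel(kappa: float, beta: float, center: float, u: float) -> float:
    """Truncated kernel (1 - (kappa * rho(center, u))^beta)_+ ."""
    return max(1.0 - (kappa * geodesic(center, u)) ** beta, 0.0)


def fit(xs: Sequence[float], ys: Sequence[float], centers: Sequence[float],
        kappa: float, beta: float) -> Callable[[float], float]:
    """Return x -> a_j, where j indexes the center closest to x."""
    coeffs: List[float] = []
    for c in centers:
        weights = [kernel(kappa, beta, c, x) for x in xs]
        total = sum(weights)
        if total == 0.0:
            raise ValueError("no design point inside the kernel support")
        coeffs.append(sum(w * y for w, y in zip(weights, ys)) / total)

    def f_hat(x: float) -> float:
        # Nearest-center assignment plays the role of the cells A_j.
        j = min(range(len(centers)), key=lambda k: geodesic(centers[k], x))
        return coeffs[j]

    return f_hat


# Hypothetical noise-free usage on an equispaced design.
n = 60
xs = [2.0 * math.pi * i / n for i in range(n)]
centers = xs[::10]
cos_hat = fit(xs, [math.cos(x) for x in xs], centers, kappa=2.0, beta=1.0)
```

In this toy setting the centers are an equispaced subset of the design, which is one simple way to obtain an asymptotically equidistant set on the circle.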

4. Main Results 

We now state the main results of this paper. The first result
provides an upper bound for the estimator (3.11), where the function $w(u)$
satisfies $w(0) = 0$, $w(u) = w(-u)$, $w(u)$ does not decrease, and $w(u)$
increases not faster than a power as $u \to \infty$.

Theorem 4.1. For the regression model (3.1) and the estimator (3.11),
we have

$\sup_{f \in \Lambda(\beta, L)} E\, w(\psi_n^{-1} \| \hat f - f \|_\infty) \le w(C_0)$,

as $n \to \infty$, where $\psi_n = (n^{-1} \log n)^{\beta/(2\beta+d)}$.

We have the asymptotic minimax result for the sup-norm risk.

Theorem 4.2. For the regression model (3.1),

$\lim_{n \to \infty} r_n(w, \beta, L) = w(C_0)$.

In particular, we have the immediate result.

Corollary 4.3. For the regression model (3.1) and the estimator (3.11),

$\sup_{f \in \Lambda(\beta, L)} E \| \hat f - f \|_\infty \le C_0 \left( \frac{\log n}{n} \right)^{\beta/(2\beta+d)}$

as $n \to \infty$.



We note that the above generalizes earlier one-dimensional results in
[20], [21], where the domain is the unit interval, whereas [18] generalizes
this result to higher dimensional unit spheres.

Now that a sharp sup-norm minimax estimator has been found, we
would like to see how we can use this for topological data analysis. The
key is the sup-norm bound on the bottleneck distance for persistence
diagrams. In particular, for the regression function $f$ in (3.1) and $\hat f$ the
estimator (3.11), we have the persistence diagram $\mathcal{D}(f)$ as well as an
estimator of the persistence diagram $\mathcal{D}(\hat f)$. Using the results of Section
2.2, and in particular (2.3), we have

(4.1) $d_B(\mathcal{D}(\hat f), \mathcal{D}(f)) \le \| \hat f - f \|_\infty$.

Let $\Lambda_t(\beta, L)$ denote the subset of tame functions in $\Lambda(\beta, L)$. By
Corollary 4.3, the following result is immediate.

Corollary 4.4. For the nonparametric regression model (3.1), let $\hat f$ be
defined by (3.11). Then for $0 < \beta \le 1$ and $L > 0$,

$\sup_{f \in \Lambda_t(\beta, L)} E\, d_B(\mathcal{D}(\hat f), \mathcal{D}(f)) \le L^{d/(2\beta+d)} \left( \frac{\sigma^2 \operatorname{vol} M \, (\beta + d) d^2}{\operatorname{vol} S^{d-1} \, \beta^2} \cdot \frac{\log n}{n} \right)^{\beta/(2\beta+d)}$

as $n \to \infty$.



5. Discussion 

To calculate the persistence diagrams of the sublevel sets of $\hat f$, we
suggest that because of the way $\hat f$ is constructed, we can calculate
its persistence diagrams using a triangulation, $T$, of the manifold in
question.

We can then filter $T$ using $\hat f$ as follows. Let $r_1 \le r_2 \le \ldots \le r_m$ be
the ordered list of values of $\hat f$ on the vertices of the triangulation. For
$1 \le i \le m$, let $T_i$ be the subcomplex of $T$ containing all vertices $v$ with
$\hat f(v) \le r_i$ and all edges whose boundaries are in $T_i$ and all faces whose
boundaries are in $T_i$. We obtain the following filtration of $T$,

$\emptyset = T_0 \subset T_1 \subset T_2 \subset \cdots \subset T_m = T$.

Because the critical points of $\hat f$ only occur at the vertices of $T$, Morse
theory guarantees that the persistent homology of the sublevel sets of
$\hat f$ equals the persistent homology of the above filtration of $T$.
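
The degree-0 part of such a computation can be sketched in a few lines of Python. This is an illustrative union-find implementation and not the Plex software; it assumes the triangulation is given only through its vertex values and edge list (the 1-skeleton suffices in degree 0), and it records the class that never dies with death value infinity, which is a convention chosen here. The name `sublevel_persistence_h0` is hypothetical. Vertices enter in order of increasing value, and when an edge joins two components the younger one dies.

```python
import math
from typing import Dict, List, Sequence, Tuple


def sublevel_persistence_h0(
    values: Sequence[float], edges: Sequence[Tuple[int, int]]
) -> List[Tuple[float, float]]:
    """(birth, death) pairs in degree 0 for the vertex-value filtration."""
    n = len(values)
    neighbours: Dict[int, List[int]] = {v: [] for v in range(n)}
    for u, v in edges:
        neighbours[u].append(v)
        neighbours[v].append(u)

    parent: Dict[int, int] = {}
    birth: Dict[int, float] = {}

    def find(v: int) -> int:
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    pairs: List[Tuple[float, float]] = []
    for v in sorted(range(n), key=lambda i: values[i]):
        parent[v] = v
        birth[v] = values[v]
        for u in neighbours[v]:
            if u not in parent:  # neighbour not yet in the sublevel complex
                continue
            ru, rv = find(u), find(v)
            if ru == rv:
                continue
            older, younger = (ru, rv) if birth[ru] <= birth[rv] else (rv, ru)
            if birth[younger] < values[v]:  # skip zero-length classes
                pairs.append((birth[younger], values[v]))
            parent[younger] = older

    roots = {find(v) for v in range(n)}
    pairs.extend((birth[r], math.inf) for r in roots)
    return sorted(pairs)


# Hypothetical example: a path with five vertices.
diagram = sublevel_persistence_h0(
    [0.0, 2.0, 1.0, 3.0, 0.5], [(0, 1), (1, 2), (2, 3), (3, 4)]
)
```

On this path, the local minima at values 1.0 and 0.5 give finite classes that die when the neighbouring peaks at 2.0 and 3.0 enter, and the global minimum gives the class that persists.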

Using the software Plex [9], we calculate the persistent homology,
in degrees $0, 1, 2, \ldots, d$, of the triangulation $T$ filtered according to
the estimator. Since the data will be $d$-dimensional, we do not expect
any interesting homology in higher degrees, and in fact, most of the
interesting features would occur in the lower degrees.

A demonstration of this is provided in [3] for brain image data, where
the topology of cortical thickness is examined in an autism study. The
persistent homology in degrees 0, 1 and 2 is calculated for 27 subjects.
Since the data is two-dimensional, we do not expect any interesting
homology in higher degrees. For an initial comparison of the autistic
subjects and control subjects, we take the union of the persistence
diagrams, see Fig. 4 in [5], page 392. We note the difference in the
topological structures as seen through the persistent homologies
between the autistic and control groups, particularly as we move away
from the diagonal line. A test using concentration pairings reveals group
differences.

6. Proofs 

Our proofs will use the ideas from [18] and [20].

6.1. Upper Bound. We first prove the earlier lemma. 

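Proof of Lemma 3.1. Let $(U, (x^i))$ be any normal coordinate chart centered
at $x_i$; then the components of the metric at $x_i$ are $g_{ij} = \delta_{ij}$, so
$\sqrt{|g_{ij}(x_i)|} = 1$, see [22]. Consequently,

$\operatorname{vol}(B_{x_i}(\lambda^{-1})) = \int_{B(\lambda^{-1})} \sqrt{|g_{ij}(\exp_{x_i}(x))|} \, dx$
$\quad = \sqrt{|g_{ij}(\exp_{x_i}(t))|} \int_{B(\lambda^{-1})} dx$
$\quad \sim \operatorname{vol}(B(\lambda^{-1}))$
$\quad = \operatorname{vol}(B(1)) \lambda^{-d}$
$\quad = \operatorname{vol}(S^{d-1}) \lambda^{-d} / d$.

The first line uses the integration transformation, where
$\exp_{x_i} : B(\lambda^{-1}) \to B_{x_i}(\lambda^{-1})$ is the exponential map from the tangent
space $TM_{x_i} \to M$. The second line uses the integral mean value theorem,
and $r$ is the radius from the origin to point $x$ in the Euclidean ball
$B(\lambda^{-1})$. The third line is asymptotic as $\lambda \to \infty$ and uses the fact that
$\sqrt{|g_{ij}(\exp_{x_i}(t))|} \to 1$ when $\lambda \to \infty$. In the fourth line $\operatorname{vol}(B(1))$ is the
volume of the $d$-dimensional Euclidean unit ball. The last line uses the
fact $\operatorname{vol}(B(1)) = \operatorname{vol}(S^{d-1})/d$.

Let $\lambda' = \lambda'(m) > 0$ be the smallest number such that the $B_{x_i}((\lambda')^{-1})$
are disjoint. Then $\lambda^{-1} = c(m) \times (\lambda')^{-1}$, where $c(m) > 1$ and $c(m) \to 1$
as $m \to \infty$. Consequently

$\operatorname{vol}(M) \ge \sum_{i=1}^m \operatorname{vol}(B_{x_i}((\lambda')^{-1})) \sim m \operatorname{vol}(S^{d-1}) (\lambda')^{-d} / d$.

Thus $\limsup_{m \to \infty} m \lambda(m)^{-d} = \limsup_{m \to \infty} c(m)^d m (\lambda')^{-d} \le \frac{d \operatorname{vol}(M)}{\operatorname{vol}(S^{d-1})}$.

□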

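We now calculate the asymptotic variance of $\hat a_j$ for $j = 1, \ldots, m$.
We have

$\operatorname{var}(\hat a_j) = \frac{\sigma^2 \sum_{i=1}^n K^2_{\kappa, x_{i_j}}(x_i)}{\left( \sum_{i=1}^n K_{\kappa, x_{i_j}}(x_i) \right)^2}$
$\quad \sim \frac{\sigma^2 \operatorname{vol}(M)}{n} \cdot \frac{\int_{B(\kappa^{-1})} (1 - (\kappa r)^\beta)^2 \sqrt{|g_{ij}(\exp_{x_{i_j}}(x))|} \, dx}{\left( \int_{B(\kappa^{-1})} (1 - (\kappa r)^\beta) \sqrt{|g_{ij}(\exp_{x_{i_j}}(x))|} \, dx \right)^2}$.

The integral in the numerator of this last expression evaluates as

$\sqrt{|g_{ij}(\exp_{x_{i_j}}(t))|} \int_0^{\kappa^{-1}} \int_{S^{d-1}} (1 - (\kappa r)^\beta)^2 r^{d-1} \, d\sigma_{d-1} \, dr$,

and similarly for the denominator, so that we have

$\operatorname{var}(\hat a_j) \sim \frac{\sigma^2 \operatorname{vol}(M) \int_0^{\kappa^{-1}} (1 - (\kappa r)^\beta)^2 r^{d-1} \, dr}{n \operatorname{vol}(S^{d-1}) \left( \int_0^{\kappa^{-1}} (1 - (\kappa r)^\beta) r^{d-1} \, dr \right)^2}$
$\quad = \sigma^2 \kappa^d \frac{\operatorname{vol}(M) \, 2d(\beta + d)}{n \operatorname{vol}(S^{d-1}) (2\beta + d)}$

as $n \to \infty$, where $d\sigma_{d-1}$ is the spherical measure on $S^{d-1}$.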
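
The factor $2d(\beta + d)/(2\beta + d)$ in the asymptotic variance comes from a ratio of two radial integrals: after the substitution $u = \kappa r$, it equals $\int_0^1 (1 - u^\beta)^2 u^{d-1} du$ divided by the square of $\int_0^1 (1 - u^\beta) u^{d-1} du$. The following Python sketch checks this identity numerically with a plain midpoint rule; the function names and the step count are choices made for illustration.

```python
from typing import Callable


def midpoint_integral(g: Callable[[float], float], n_steps: int = 20_000) -> float:
    """Midpoint-rule approximation of the integral of g over [0, 1]."""
    h = 1.0 / n_steps
    return h * sum(g((k + 0.5) * h) for k in range(n_steps))


def variance_factor_numeric(beta: float, d: int) -> float:
    """Ratio of the two radial integrals, computed numerically."""
    num = midpoint_integral(lambda u: (1.0 - u ** beta) ** 2 * u ** (d - 1))
    den = midpoint_integral(lambda u: (1.0 - u ** beta) * u ** (d - 1))
    return num / den ** 2


def variance_factor_closed_form(beta: float, d: int) -> float:
    """Closed form 2 d (beta + d) / (2 beta + d)."""
    return 2.0 * d * (beta + d) / (2.0 * beta + d)


# Hypothetical parameter pairs (beta, d) to compare.
checks = [(1.0, 1), (0.5, 2), (1.0, 3)]
gaps = [
    abs(variance_factor_numeric(b, d) - variance_factor_closed_form(b, d))
    for b, d in checks
]
```

The entries of `gaps` are small, which is consistent with the elementary integrals $\int_0^1 (1 - u^\beta)^2 u^{d-1} du = 2\beta^2 / (d(\beta+d)(2\beta+d))$ and $\int_0^1 (1 - u^\beta) u^{d-1} du = \beta / (d(\beta+d))$.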
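Lemma 6.1.

$\lim_{n \to \infty} P\left( \psi_n^{-1} \| \hat f_n - E \hat f_n \|_\infty > (1 + \delta) C_0 \frac{2\beta}{2\beta + d} \right) = 0$.

Proof. Denote $Z_n(x) = \hat f_n(x) - E \hat f_n(x)$. Define

$D_n^2 = \operatorname{var}(\psi_n^{-1} Z_n(x_j)) = \psi_n^{-2} \operatorname{var}(\hat a_j) = \frac{2 \beta^2 C_0^2}{d (2\beta + d) \log n}$.

Denote $y = (1 + \delta) C_0 2\beta / (2\beta + d)$. Then

$\frac{y^2}{D_n^2} = \frac{2 d (1 + \delta)^2 \log n}{2\beta + d}$.

For sufficiently large $n$, $Z_n(x_j) \sim N(0, \psi_n^2 D_n^2)$, hence as $n \to \infty$,

$P(\| \psi_n^{-1} Z_n \|_\infty > y) \le P\left( \max_{j = 1, \ldots, m} \psi_n^{-1} |Z_n(x_j)| > y \right)$
$\quad \le m P\left( D_n^{-1} \psi_n^{-1} |Z_n(x_j)| > \frac{y}{D_n} \right)$
$\quad \le m \exp\left\{ -\frac{1}{2} \frac{y^2}{D_n^2} \right\} = m \exp\left\{ -\frac{d (1 + \delta)^2 \log n}{2\beta + d} \right\}$.

Therefore

$P(\| \psi_n^{-1} Z_n \|_\infty > y) \le n^{-d((1+\delta)^2 - 1)/(2\beta+d)} (\log n)^{-d/(2\beta+d)} C_1 \left( \frac{L(2\beta + d)}{\delta C_0 d} \right)^{d/\beta} \to 0$.

□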

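Lemma 6.2.

$\limsup_{n \to \infty} \sup_{f \in \Lambda(\beta, L)} \psi_n^{-1} \| f - E \hat f_n \|_\infty \le (1 + \delta) C_0 \frac{d}{2\beta + d}$.

Proof. We note that

$\| f - E \hat f \|_\infty = \max_{j = 1, \ldots, m} \sup_{x \in A_j} | f(x) - E \hat f(x) |$
$\quad \le \max_{j = 1, \ldots, m} \sup_{x \in A_j} \left( |f(x) - f(x_j)| + |E \hat f(x_j) - f(x_j)| \right)$
$\quad \le \max_{j = 1, \ldots, m} \left( |E \hat f(x_j) - f(x_j)| + L \sup_{x \in A_j} \rho(x, x_j)^\beta \right)$.

When $m$ is sufficiently large, $A_j \subset B_{x_j}(\lambda^{-1})$, hence by Lemma 3.1

$\limsup_{n \to \infty} \sup_{x \in A_j} \rho(x, x_j) \le \limsup_{n \to \infty} \lambda^{-1} \le \limsup_{n \to \infty} \left( \frac{C_1}{m} \right)^{1/d}$.

Thus

$\limsup_{n \to \infty} \sup_{x \in A_j} \psi_n^{-1} \rho(x, x_j)^\beta \le \limsup_{n \to \infty} \psi_n^{-1} \left( \frac{C_1}{m} \right)^{\beta/d} \le \frac{\delta C_0 d}{L(2\beta + d)}$.

For $j = 1, \ldots, m$,

$|E \hat f(x_j) - f(x_j)| \le \frac{\sum_{i=1}^n K_{\kappa, x_j}(x_i) |f(x_i) - f(x_j)|}{\sum_{i=1}^n K_{\kappa, x_j}(x_i)}$
$\quad \le L \frac{\sum_{i=1}^n K_{\kappa, x_j}(x_i) \rho(x_i, x_j)^\beta}{\sum_{i=1}^n K_{\kappa, x_j}(x_i)}$
$\quad \sim \frac{L}{\kappa^\beta} \cdot \frac{d}{2\beta + d} = C_0 \psi_n \frac{d}{2\beta + d}$

as $n \to \infty$. □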

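Proof of the upper bound.

$\lim_{n \to \infty} P\left( \psi_n^{-1} \| \hat f - f \|_\infty > (1 + \delta) C_0 \right)$
$\quad \le \lim_{n \to \infty} P\left( \psi_n^{-1} \| \hat f - E \hat f \|_\infty + \psi_n^{-1} \| E \hat f - f \|_\infty > (1 + \delta) C_0 \right)$
$\quad \le \lim_{n \to \infty} P\left( \psi_n^{-1} \| \hat f - E \hat f \|_\infty + (1 + \delta) C_0 \frac{d}{2\beta + d} > (1 + \delta) C_0 \right)$
$\quad = \lim_{n \to \infty} P\left( \psi_n^{-1} \| \hat f - E \hat f \|_\infty > (1 + \delta) C_0 \frac{2\beta}{2\beta + d} \right) = 0$;

the second inequality uses Lemma 6.2 and the last line uses Lemma 6.1.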
Let $g_n$ be the density function of $\psi_n^{-1} \| \hat f_n - f \|_\infty$; then

$\limsup_{n \to \infty} E\, w^2(\psi_n^{-1} \| \hat f_n - f \|_\infty)$
$\quad = \limsup_{n \to \infty} \left( \int_0^{(1+\delta) C_0} w^2(x) g_n(x) \, dx + \int_{(1+\delta) C_0}^\infty w^2(x) g_n(x) \, dx \right)$
$\quad \le w^2((1 + \delta) C_0) + \limsup_{n \to \infty} \int_{(1+\delta) C_0}^\infty x^\alpha g_n(x) \, dx = w^2((1 + \delta) C_0) \le B < \infty$,

where the constant $B$ does not depend on $f$; the third line uses the
assumption on the power growth and non-decreasing property of the
loss function $w(u)$. Using the Cauchy-Schwarz inequality, we have

$\limsup_{n \to \infty} E\, w(\psi_n^{-1} \| \hat f_n - f \|_\infty)$
$\quad \le w((1 + \delta) C_0) \limsup_{n \to \infty} P\left( \psi_n^{-1} \| \hat f - f \|_\infty \le (1 + \delta) C_0 \right)$
$\qquad + \limsup_{n \to \infty} \left\{ E\, w^2(\psi_n^{-1} \| \hat f_n - f \|_\infty) \, P\left( \psi_n^{-1} \| \hat f - f \|_\infty > (1 + \delta) C_0 \right) \right\}^{1/2}$
$\quad = w((1 + \delta) C_0)$.

□



6.2. The lower bound. We now prove the lower bound result on 



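Lemma 6.3. For sufficiently large $\kappa$, let $N = N(\kappa)$ be such that
$N \to \infty$ when $\kappa \to \infty$ and $x_i \in M$, $i = 1, \ldots, N$, be such that the $x_i$ are
asymptotically equidistant, and such that the $B_{x_i}(\kappa^{-1})$ are disjoint. There
is a constant $0 < D < \infty$ such that

(6.1) $\liminf_{\kappa \to \infty} N(\kappa) \kappa^{-d} \ge D$.

Proof. Let $\kappa' > 0$ be the largest number such that $\bigcup_{i=1}^{N(\kappa)} \bar B_{x_i}((\kappa')^{-1}) = M$.
Then

$(\kappa')^{-1} = c(\kappa) \times \kappa^{-1}$,

where $c(\kappa) > 1$ and $c(\kappa) \to \text{const.} \ge 1$ as $\kappa \to \infty$. Consequently

$\operatorname{vol}(M) \le \sum_{i=1}^N \operatorname{vol}(\bar B_{x_i}((\kappa')^{-1})) \sim N \operatorname{vol}(S^{d-1}) (\kappa')^{-d} / d$.

Thus

$\liminf_{\kappa \to \infty} N(\kappa) \kappa^{-d} = \liminf_{\kappa \to \infty} c(\kappa)^{-d} N (\kappa')^{-d} \ge \text{const.} \times \frac{d \operatorname{vol}(M)}{\operatorname{vol}(S^{d-1})}$.

□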



Let $J_{\kappa, x} : M \to \mathbb{R}$, and

$J_{\kappa, x}(u) = L \kappa^{-\beta} K_{\kappa, x}(u) = L \kappa^{-\beta} (1 - (\kappa \rho(x, u))^\beta)_+$,

where $\kappa > 0$, $x \in M$. Let $N = N(\kappa)$ be the greatest integer such that
there exist observations $x_i \in M$, $i = 1, \ldots, N$ (with possible relabeling)
in the observation set $\{x_i, i = 1, \ldots, n\}$ such that the functions
$J_{\kappa, x_i}$ have disjoint supports. From (6.1),

$\liminf_{\kappa \to \infty} N(\kappa) \kappa^{-d} \ge \text{const.}$
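Let

$C(\kappa, \{x_i\}) = \left\{ \sum_{i=1}^N \theta_i J_{\kappa, x_i} : |\theta_i| \le 1, \ i = 1, \ldots, N \right\}$,

where $C(\kappa, \{x_i\}) \subset \Lambda(\beta, L)$ when $0 < \beta \le 1$. The complete class of
estimators for estimating $f \in C(\kappa, \{x_i\})$ consists of all of the form

(6.2) $\hat f_n = \sum_{i=1}^N \hat\theta_i J_{\kappa, x_i}$,

where $\hat\theta_i = \delta_i(z_1, \ldots, z_N)$, $i = 1, \ldots, N$, and

$z_i = \frac{\sum_{j=1}^n J_{\kappa, x_i}(x_j) y_j}{\sum_{j=1}^n J^2_{\kappa, x_i}(x_j)}$.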

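When $\hat f_n$ is of the form (6.2) and $f \in C(\kappa, \{x_i\})$ then

$\| \hat f_n - f \|_\infty \ge \max_{i = 1, \ldots, N} |\hat f_n(x_i) - f(x_i)| = |J_{\kappa, x_1}(x_1)| \, \| \hat\theta - \theta \|_\infty = L \kappa^{-\beta} \| \hat\theta - \theta \|_\infty$.

Hence

$r_n \ge \inf_{\hat f_n} \sup_{f \in C(\kappa, \{x_i\})} E\, w(\psi_n^{-1} \| \hat f_n - f \|_\infty)$
$\quad \ge \inf_{\hat\theta} \sup_{|\theta_i| \le 1} E\, w(\psi_n^{-1} L \kappa^{-\beta} \| \hat\theta - \theta \|_\infty)$,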

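where the expectation is with respect to a multivariate normal
distribution with mean vector $\theta$ and the variance-covariance matrix $\sigma_N^2 I_N$,
where $I_N$ is the $N \times N$ identity matrix and
$\sigma_N^2 = \operatorname{var}(z_i) = \sigma^2 / \sum_{j=1}^n J^2_{\kappa, x_i}(x_j)$.
Fix a small number $\delta$ such that $0 < \delta < 2$ and

$C_0' = L^{d/(2\beta+d)} \left( \frac{(2 - \delta) \sigma^2 \operatorname{vol}(M) (\beta + d) d^2}{2 \operatorname{vol}(S^{d-1}) \beta^2} \right)^{\beta/(2\beta+d)}$

and

$\kappa = \left( \frac{C_0' \psi_n}{L} \right)^{-1/\beta}$.

Since

$\sigma_N^{-1} = \sigma^{-1} \left( \sum_{j=1}^n J^2_{\kappa, x_i}(x_j) \right)^{1/2}$
$\quad \le \sqrt{(2 - \delta) \log\left( (\log n / n)^{-d/(2\beta+d)} \right)}$
$\quad = \sqrt{2 - \delta} \sqrt{\log(\text{const.} \times \kappa^d)} = \sqrt{2 - \delta} \sqrt{\log N}$

by (6.1), it follows that if

$\sigma_N^{-1} \le \sqrt{2 - \delta} \sqrt{\log N}$

for some $0 < \delta < 2$, then

$\inf_{\hat\theta} \sup_{|\theta_i| \le 1} E\, w(\| \hat\theta - \theta \|_\infty) \to w(1)$,

as $N \to \infty$, but

$\psi_n^{-1} L \kappa^{-\beta} = C_0'$.

By the continuity of the function $w$, we have

$\inf_{\hat\theta} \sup_{|\theta_i| \le 1} E\, w(\psi_n^{-1} L \kappa^{-\beta} \| \hat\theta - \theta \|_\infty) \to w(C_0')$,

when $N \to \infty$. Since $\delta$ was chosen arbitrarily, the result follows.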

Appendix A. Background on Topology 

In this appendix we present a technical overview of homology as used 
in our procedures. For an intensive treatment we refer the reader to 
the excellent text [52] . 

Homology is an algebraic procedure for counting holes in topological
spaces. There are numerous variants of homology: we use simplicial
homology with $\mathbb{Z}$ coefficients. Given a set of points $V$, a $k$-simplex is an
unordered subset $\{v_0, v_1, \ldots, v_k\}$ where $v_i \in V$ and $v_i \ne v_j$ for all $i \ne j$.
The faces of this $k$-simplex consist of all $(k-1)$-simplices of the form
$\{v_0, \ldots, v_{i-1}, v_{i+1}, \ldots, v_k\}$ for some $0 \le i \le k$. Geometrically, the
$k$-simplex can be described as follows: given $k + 1$ points in $\mathbb{R}^m$ ($m \ge k$),
the $k$-simplex is a convex body bounded by the union of the
$(k-1)$-dimensional linear subspaces of $\mathbb{R}^m$ defined by all possible
collections of $k$ points (chosen out of $k + 1$ points). A simplicial complex
is a collection of simplices which is closed with respect to inclusion of
faces. Triangulated surfaces form a concrete example, where the vertices
of the triangulation correspond to $V$. The orderings of the vertices
correspond to an orientation. Any abstract simplicial complex on a (finite)
set of points $V$ has a geometric realization in some $\mathbb{R}^m$. Let $X$ denote a
simplicial complex. Roughly speaking, the homology of $X$, denoted $H_*(X)$,
is a sequence of vector spaces $\{H_k(X) : k = 0, 1, 2, 3, \ldots\}$, where $H_k(X)$
is called the $k$-dimensional homology of $X$. The dimension of $H_k(X)$,
called the $k$-th Betti number of $X$, is a coarse measurement of the
number of different holes in the space $X$ that can be sensed by using
subcomplexes of dimension $k$.

For example, the dimension of $H_0(X)$ is equal to the number of
connected components of $X$. These are the types of features (holes) in $X$
that can be detected by using points and edges; with this construction
one is answering the question: are two points connected by a sequence
of edges or not? The simplest basis for $H_0(X)$ consists of a choice of
vertices in $X$, one in each path-component of $X$. Likewise, the
simplest basis for $H_1(X)$ consists of loops in $X$, each of which surrounds
a hole in $X$. For example, if $X$ is a graph, then the space $H_1(X)$
encodes the number and types of cycles in the graph; this space has
the structure of a vector space. Let $X$ denote a simplicial complex.
Define for each $k \ge 0$, the vector space $C_k(X)$ to be the vector space
whose basis is the set of oriented $k$-simplices of $X$; that is, a $k$-simplex
$\{v_0, \ldots, v_k\}$ together with an order type denoted $[v_0, \ldots, v_k]$, where a
change in orientation corresponds to a change in the sign of the
coefficient: $[v_0, \ldots, v_i, \ldots, v_j, \ldots, v_k] = -[v_0, \ldots, v_j, \ldots, v_i, \ldots, v_k]$ if an
odd permutation is used.

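For $k$ larger than the dimension of $X$, we set $C_k(X) = 0$. The
boundary map is defined to be the linear transformation
$\partial : C_k \to C_{k-1}$ which acts on basis elements
$[v_0, \dots, v_k]$ via

\[ \partial[v_0, \dots, v_k] := \sum_{i=0}^{k} (-1)^i [v_0, \dots, v_{i-1}, v_{i+1}, \dots, v_k]. \tag{A.1} \]

This gives rise to a chain complex: a sequence of vector spaces and
linear transformations

\[ \cdots \xrightarrow{\partial} C_{k+1} \xrightarrow{\partial} C_k \xrightarrow{\partial} C_{k-1} \xrightarrow{\partial} \cdots \xrightarrow{\partial} C_1 \xrightarrow{\partial} C_0. \]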

Consider the following two subspaces of $C_k$: the cycles (those
subcomplexes without boundary) and the boundaries (those subcomplexes
which are themselves boundaries), formally defined as:

• $k$-cycles: $Z_k(X) = \ker(\partial : C_k \to C_{k-1})$

• $k$-boundaries: $B_k(X) = \operatorname{im}(\partial : C_{k+1} \to C_k)$

A simple lemma demonstrates that $\partial \circ \partial = 0$; that is,
the boundary of a chain has empty boundary. It follows that $B_k$ is a
subspace of $Z_k$. This has great implications. The $k$-cycles in $X$
are the basic objects which count the presence of a "hole of dimension
$k$" in $X$. But, certainly, many of the $k$-cycles in $X$ are measuring
the same hole; still other cycles do not really detect a hole at all,
since they bound a subcomplex of dimension $k+1$ in $X$. We say that two
cycles $\zeta$ and $\eta$ in $Z_k(X)$ are homologous if their difference
is a boundary:

\[ [\zeta] = [\eta] \iff \zeta - \eta \in B_k(X). \]
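The boundary formula (A.1) and the identity $\partial \circ \partial = 0$ can be checked by hand on a single triangle. The following is a minimal Python sketch, not part of the original text. The function name `boundary` and the representation of a chain as a dictionary from sorted vertex tuples to integer coefficients are assumptions made only for this illustration, as is the convention that the boundary of a vertex is zero.

```python
from collections import defaultdict

def boundary(chain):
    """Apply the boundary map (A.1) to a chain.

    A chain is a dict mapping oriented simplices (tuples of vertices in
    increasing order) to integer coefficients. Hypothetical helper.
    """
    result = defaultdict(int)
    for simplex, coeff in chain.items():
        if len(simplex) == 1:
            continue  # assumed convention: the boundary of a vertex is 0
        for i in range(len(simplex)):
            face = simplex[:i] + simplex[i + 1:]  # omit the i-th vertex
            result[face] += (-1) ** i * coeff
    # Drop faces whose coefficients cancelled.
    return {s: c for s, c in result.items() if c != 0}

triangle = {(0, 1, 2): 1}
edges = boundary(triangle)
print(edges)            # {(1, 2): 1, (0, 2): -1, (0, 1): 1}
print(boundary(edges))  # {}
```

The second call returns the empty chain: every vertex appears once with each sign, which is the cancellation behind $\partial \circ \partial = 0$.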






The $k$-dimensional homology of $X$, denoted $H_k(X)$, is the quotient
vector space

\[ H_k(X) := Z_k(X) / B_k(X). \tag{A.2} \]

Specifically, an element of $H_k(X)$ is an equivalence class of
homologous $k$-cycles. This inherits the structure of a vector space in
the natural way: $[\zeta] + [\eta] = [\zeta + \eta]$ and
$c[\zeta] = [c\zeta]$.
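Since $H_k(X)$ is a quotient of finite-dimensional vector spaces, its dimension is $\dim Z_k - \dim B_k$, and both terms can be read off from the ranks of the boundary matrices. The sketch below illustrates this on the hollow and the filled triangle. It is not the authors' procedure: the function name `betti_numbers` and the input format are hypothetical, and the ranks are computed numerically over the reals with NumPy rather than over $\mathbb{Z}$.

```python
import numpy as np

def betti_numbers(simplices_by_dim):
    """Betti numbers of a finite simplicial complex (real coefficients).

    simplices_by_dim[k] lists the k-simplices as sorted vertex tuples.
    Hypothetical helper written for this illustration.
    """
    def boundary_matrix(k):
        # Matrix of the boundary map C_k -> C_{k-1} from (A.1).
        rows = {s: r for r, s in enumerate(simplices_by_dim[k - 1])}
        mat = np.zeros((len(rows), len(simplices_by_dim[k])))
        for col, simplex in enumerate(simplices_by_dim[k]):
            for i in range(len(simplex)):
                face = simplex[:i] + simplex[i + 1:]
                mat[rows[face], col] = (-1) ** i
        return mat

    top = len(simplices_by_dim) - 1
    ranks = [0]  # the boundary map out of C_0 is zero
    for k in range(1, top + 1):
        mat = boundary_matrix(k)
        ranks.append(int(np.linalg.matrix_rank(mat)) if mat.size else 0)
    ranks.append(0)  # C_{top+1} = 0, so there are no top-level boundaries

    # dim H_k = dim Z_k - dim B_k
    #         = (dim C_k - rank d_k) - rank d_{k+1}
    return [len(simplices_by_dim[k]) - ranks[k] - ranks[k + 1]
            for k in range(top + 1)]

vertices = [(0,), (1,), (2,)]
edges = [(0, 1), (0, 2), (1, 2)]
hollow = [vertices, edges]
filled = [vertices, edges, [(0, 1, 2)]]

print(betti_numbers(hollow))  # [1, 1]
print(betti_numbers(filled))  # [1, 0, 0]
```

The hollow triangle has one connected component and one loop; adding the 2-simplex turns that loop into a boundary, so the first Betti number drops to zero.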

A map $f : X \to Y$ is a homotopy equivalence if there is a map
$g : Y \to X$ so that $f \circ g$ is homotopic to the identity map on
$Y$ and $g \circ f$ is homotopic to the identity map on $X$. This notion
is a weakening of the notion of homeomorphism, which requires the
existence of a continuous map $g$ so that $f \circ g$ and $g \circ f$
are equal to the corresponding identity maps. The less restrictive
notion of homotopy equivalence is useful in understanding relationships
between complicated spaces and spaces with simple descriptions. We say
two spaces $X$ and $Y$ are homotopy equivalent, or have the same
homotopy type, if there is a homotopy equivalence from $X$ to $Y$. This
is denoted by $X \sim Y$.

By arguments utilizing barycentric subdivision, one may show that the
homology $H_*(X)$ is a topological invariant of $X$: it is indeed an
invariant of homotopy type. Readers familiar with the Euler
characteristic of a triangulated surface will not find it odd that
intelligent counting of simplices yields an invariant. For a simple
example, the reader is encouraged to contemplate the "physical" meaning
of $H_1(X)$. Elements of $H_1(X)$ are equivalence classes of (finite
collections of) oriented cycles in the 1-skeleton of $X$, the
equivalence relation being determined by the 2-skeleton of $X$.

It is often remarked that homology is functorial, by which it is meant
that things behave the way they ought. A simple example of this which
is crucial to our applications arises as follows. Consider two
simplicial complexes $X$ and $X'$. Let $f : X \to X'$ be a continuous
simplicial map: $f$ takes each $k$-simplex of $X$ to a $k'$-simplex of
$X'$, where $k' \le k$. Then the map $f$ induces a linear transformation
$f_\# : C_k(X) \to C_k(X')$. It is a simple lemma to show that $f_\#$
takes cycles to cycles and boundaries to boundaries; hence there is a
well-defined linear transformation on the quotient spaces

\[ f_* : H_k(X) \to H_k(X'), \qquad f_*([\zeta]) = [f_\#(\zeta)]. \]

This is called the induced homomorphism of $f$ on $H_*$. Functoriality
means that (1) if $f : X \to Y$ is continuous then
$f_* : H_k(X) \to H_k(Y)$ is a group homomorphism; and (2) the
composition of two maps $g \circ f$ induces the composition of the
linear transformations: $(g \circ f)_* = g_* \circ f_*$.
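As a small worked illustration of an induced homomorphism (an example added here, not taken from the text), let $X$ be the hollow triangle on vertices $v_0, v_1, v_2$, consisting of three vertices and three edges, let $X'$ be the filled triangle obtained by adding the 2-simplex, and let $f : X \to X'$ be the inclusion. The cycle $\zeta$ below is a choice made only for this example.

```latex
\begin{align*}
\zeta &= [v_0, v_1] + [v_1, v_2] - [v_0, v_2] \in Z_1(X),
  \qquad [\zeta] \neq 0 \text{ in } H_1(X) \text{ since } B_1(X) = 0, \\
f_\#(\zeta) &= [v_0, v_1] + [v_1, v_2] - [v_0, v_2]
  = \partial [v_0, v_1, v_2] \in B_1(X'), \\
f_*([\zeta]) &= [f_\#(\zeta)] = 0 \text{ in } H_1(X').
\end{align*}
```

The loop that represents a hole in $X$ is filled in by the 2-simplex of $X'$, so the induced map on $H_1$ is zero even though $f$ is injective on simplices.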



The development of Morse theory has been instrumental in classifying
manifolds and represents a pathway between geometry and topology. A
classic reference is Milnor [24].

For a smooth function $f : M \to \mathbb{R}$, consider a point
$p \in M$ where, in local coordinates, the derivative vanishes:
$\partial f / \partial x_1 = 0, \dots, \partial f / \partial x_d = 0$.
Such a point is called a critical point, and the value $f(p)$ is called
a critical value. A critical point $p \in M$ is called non-degenerate if
the Hessian $(\partial^2 f / \partial x_i \partial x_j)$ is nonsingular.
Functions all of whose critical points are non-degenerate are called
Morse functions.
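The non-degeneracy condition is easy to test numerically once the Hessian at a critical point is known. The sketch below is a hedged illustration: the function name `classify_critical_point`, the tolerance `tol`, and the two example functions are assumptions for this example only. It also counts the negative eigenvalues of the Hessian.

```python
import numpy as np

def classify_critical_point(hessian, tol=1e-12):
    """Return (is_nondegenerate, number_of_negative_eigenvalues).

    `hessian` is the symmetric matrix of second partial derivatives at
    a critical point. Hypothetical helper; `tol` is an arbitrary
    numerical threshold for treating an eigenvalue as zero.
    """
    eigenvalues = np.linalg.eigvalsh(np.asarray(hessian, dtype=float))
    nondegenerate = bool(np.all(np.abs(eigenvalues) > tol))
    negative = int(np.sum(eigenvalues < -tol))
    return nondegenerate, negative

# f(x, y) = x^2 - y^2 at the origin: the Hessian is diag(2, -2).
saddle = classify_critical_point([[2, 0], [0, -2]])

# f(x, y) = x^3 + y^2 at the origin: the Hessian diag(0, 2) is singular,
# so this critical point is degenerate.
degenerate = classify_critical_point([[0, 0], [0, 2]])

print(saddle, degenerate)  # (True, 1) (False, 0)
```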

Since the Hessian at a critical point is nondegenerate, there will be
a mixture of positive and negative eigenvalues. Let $\eta$ be the number
of negative eigenvalues of the Hessian at a critical point; this number
is called the Morse index. The basic Morse lemma states that at a
critical point $p \in M$ with index $\eta$ there exist a neighborhood
$U$ of $p$ and local coordinates $x = (x_1, \dots, x_d)$ on $U$ so that
$x(p) = 0$ and

\[ f(q) = f(p) - x_1(q)^2 - \cdots - x_\eta(q)^2 + x_{\eta+1}(q)^2 + \cdots + x_d(q)^2 \]

for all $q \in U$.

Based on this result one is able to show that at a critical point
$p \in M$, with $f(p) = a$ say, the sublevel set $M_{f \le a}$ has the
same homotopy type as the sublevel set $M_{f \le a - \varepsilon}$ (for
some small $\varepsilon > 0$) with an $\eta$-dimensional cell attached
to it. In fact, for a compact $M$, its homotopy type is that of a cell
complex with one $\eta$-dimensional cell for each critical point of
index $\eta$. This cell complex is known as a CW complex in homotopy
theory, if the cells are attached in the order of their dimension.

The famous set of Morse inequalities states that if $\beta_k$ is the
$k$-th Betti number and $m_k$ is the number of critical points of index
$k$, then

\begin{align*}
\beta_0 &\le m_0 \\
\beta_1 - \beta_0 &\le m_1 - m_0 \\
\beta_2 - \beta_1 + \beta_0 &\le m_2 - m_1 + m_0 \\
&\;\;\vdots \\
\chi(M) = \sum_{k=0}^{d} (-1)^k \beta_k &= \sum_{k=0}^{d} (-1)^k m_k
\end{align*}

where $\chi$ denotes the Euler characteristic.
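The Morse inequalities can be checked mechanically for given Betti numbers and critical-point counts. The sketch below is an illustration under stated assumptions: the function name `satisfies_morse_inequalities` is hypothetical, and it tests the full chain of alternating inequalities (of which the first three are displayed above) together with the Euler characteristic identity. The example counts are the standard ones for the height function on an upright torus, which has one minimum, two saddles, and one maximum.

```python
def satisfies_morse_inequalities(betti, critical):
    """Check the Morse inequalities for beta_0..beta_d and m_0..m_d.

    betti[k] is the k-th Betti number and critical[k] the number of
    critical points of index k. Hypothetical helper for illustration.
    """
    if len(betti) != len(critical):
        raise ValueError("betti and critical must have the same length")
    d = len(betti) - 1
    for j in range(d + 1):
        # beta_j - beta_{j-1} + ... <= m_j - m_{j-1} + ...
        lhs = sum((-1) ** (j - k) * betti[k] for k in range(j + 1))
        rhs = sum((-1) ** (j - k) * critical[k] for k in range(j + 1))
        if lhs > rhs:
            return False
    # Euler characteristic computed from either sequence must agree.
    chi_betti = sum((-1) ** k * b for k, b in enumerate(betti))
    chi_crit = sum((-1) ** k * m for k, m in enumerate(critical))
    return chi_betti == chi_crit

# Torus with the height function: both sequences are (1, 2, 1).
print(satisfies_morse_inequalities([1, 2, 1], [1, 2, 1]))  # True

# Sphere with one minimum, one saddle and two maxima.
print(satisfies_morse_inequalities([1, 0, 1], [1, 1, 2]))  # True

# Too few saddles for a torus: the second inequality fails.
print(satisfies_morse_inequalities([1, 2, 1], [1, 1, 1]))  # False
```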